the human genome, this approach will become more widespread, particularly for
localization and gene identification.

Allelic association refers to a significantly increased or decreased frequency of a
marker allele with a disease trait and represents deviations from the random
occurrence of the alleles with respect to disease phenotype. Allelic association can
be due to either linkage or population substructure. We use linkage disequilibrium to
mean allelic association maintained by tight linkage. Linkage disequilibrium occurs
when a particular marker allele lies so close to the disease susceptibility allele
that these alleles will be inherited together over many generations. Thus the same
allele will be detected in affected individuals in multiple apparently unrelated
families. Conceptually this is the same as standard linkage analysis, except that the
recombination distances being measured are now very small (generally < 1 cM), and the
recombination events can only be inferred based on the level of sharing of the same
allele. Population substructure most often occurs with the recent admixture of
populations. In the case of population substructure, alleles may show a statistical
association simply by chance due to differences in allele frequencies in the two
mixing populations. This can occur even when there is no biological association or
true genetic linkage.



LINKAGE DISEQUILIBRIUM

To understand one way in which linkage disequilibrium can come about, consider
this hypothetical example. A new mutation occurs in a gene that results in a
disease-causing phenotype. At the time of the initial mutation, every marker allele
for every marker on the chromosome is "associated" with the disease mutation. The
chromosome with the disease mutation is then transmitted to the offspring of the
original individual in whom the mutation occurred. Transmission over several
generations gives the opportunity for recombination to occur and thus for the
rearrangement of the alleles at the marker loci. Alleles at marker loci that are
further away from the disease mutation will exchange faster than markers that are
closer to the disease mutation. The closer the marker is to the disease gene, the
longer the marker allele/disease association will persist.

When marker and disease loci are very close together on a chromosome, genetic
crossing over will have occurred at such a low rate that the marker will appear to
cosegregate with the gene regardless of the family studied. This is in contrast to the
situation of two loci further apart but still linked, in which case repeated crossing
over will allow all possible combinations of chromosomal haplotypes to appear
with frequencies as predicted by the equation for Hardy-Weinberg equilibrium
(Chapter 2). Thus, linkage disequilibrium can be very useful in defining the
ancestral haplotype of a disease gene in relation to several marker loci; it can be
used for fine-mapping of the disease gene even when complete linkage (θ = 0.0) is
established in the families being studied.

There are many measurements of linkage disequilibrium (Devlin and Risch,
1995). The most commonly used is the disequilibrium coefficient D:

$$D = p_{11} - p_1 p_2$$

where $p_{11}$ is the observed frequency of the 1/1 haplotype, $p_1$ is the frequency
of the "1" allele at locus 1 in the general population, and $p_2$ is the population
frequency of the "1" allele at locus 2. Generally, the "1" allele at each locus is
defined as the most common of the alleles at that locus. Because we assign "1" as the
most common allele, the coefficient D ranges from -0.25 to 0.25. Positive values of D
indicate that the common alleles at each locus segregate together. Negative values
indicate that the common allele at one locus segregates with the rare allele at the
other locus. The rate of decay of linkage disequilibrium is dependent on the distance
between loci:

$$D_t = (1 - \theta)^t D_0$$

where $t$ is the current generation number, $D_t$ is the current amount of
disequilibrium, $D_0$ is the disequilibrium at generation 0, and $\theta$ is the
recombination fraction between loci.
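
The two relations in this section can be illustrated with a minimal Python sketch. It takes D as the haplotype frequency minus the product of the two allele frequencies and assumes the standard decay relation D_t = (1 - theta)^t * D_0. The function names and the example frequencies are hypothetical and chosen only for illustration; they are not taken from any data set.

```python
import math


def disequilibrium(p11, p1, p2):
    """Disequilibrium coefficient D = p11 - p1 * p2."""
    return p11 - p1 * p2


def decayed_disequilibrium(d0, theta, t):
    """Disequilibrium left after t generations: D_t = (1 - theta)**t * D_0."""
    return (1.0 - theta) ** t * d0


def generations_to_fraction(theta, fraction):
    """Generations needed for D to fall to `fraction` of its starting value."""
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie strictly between 0 and 1")
    return math.log(fraction) / math.log(1.0 - theta)


if __name__ == "__main__":
    # Hypothetical frequencies, for illustration only.
    d0 = disequilibrium(p11=0.50, p1=0.60, p2=0.70)
    print(f"D = {d0:.2f}")
    for theta in (0.5, 0.01):  # unlinked loci vs. tightly linked loci
        half_life = generations_to_fraction(theta, 0.5)
        print(f"theta = {theta}: D halves in about {half_life:.0f} generation(s)")
```

Under these assumptions, disequilibrium between unlinked loci (theta = 0.5) halves every generation, whereas at theta = 0.01 halving takes roughly 69 generations.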

Allelic association due to population admixture, selection, or genetic drift
between unlinked loci will decay fairly rapidly in comparison to linkage
disequilibrium between tightly linked genetic loci, and thus is a short-term
phenomenon that will be almost impossible to detect in the typical study. However,
linkage disequilibrium will decay rather slowly, dependent primarily on the
recombination distance between the markers and the number of generations that has
passed since the initial event (Fig. 15.1). The slowness of linkage disequilibrium
decay makes this a useful mapping tool.

The general rule of thumb is that the stronger the disequilibrium, the closer the
marker is to the disease locus. This is not always the case, however, for several
reasons. First, the frequencies of the marker alleles have an impact on the power to
detect linkage disequilibrium. For example, if the disease susceptibility allele is
associated with a marker allele whose general population allele frequency is 0.50, an
observation that this marker allele frequency is 0.80 in affected individuals
represents only a 60% increase in the frequency. However, if the marker allele
population frequency is only 0.20, an observed 0.80 frequency is a 300% increase (a
fourfold rise). Mutation rates at the marker locus also affect disequilibrium by
increasing the chance that the associated marker allele will change and so seem to be
representing a different chromosome.

Population bottlenecks, where the effective population size is reduced to a very
small number for a period of time before the population size increases again, can
create or reinforce an existing association. This is done by the random loss to the
gene pool of most chromosomes carrying the susceptibility allele; what remains
may have existed in only one individual who survived the bottleneck. Chance loss
of susceptibility-allele-bearing chromosomes (random genetic drift) can also
generate linkage disequilibrium. Two phenomena that can complicate the analysis of
allelic association are selection in favor of a particular phenotype and new
mutations (at either the disease or marker loci) arising in the population.

MAPPING GENES USING LINKAGE DISEQUILIBRIUM
AND SPECIAL POPULATIONS

In most cases, allelic association will result from linkage disequilibrium, unless a
specific polymorphism in the actual susceptibility gene is being studied. The power
of linkage disequilibrium is best exploited in its use in fine-mapping. Because
linkage disequilibrium rarely extends more than 1 cM from the susceptibility locus,
its detection signals a significant decrease in the minimum candidate region. However,
this great strength is also its great drawback. Because the effect is so localized, it
will be very hard to find against the background of the entire human genome. In
genetically complex diseases, the further complications of genetic heterogeneity
and/or gene-gene interaction may make detection even more difficult.

Mapping and/or gene identification using linkage disequilibrium is especially
powerful in genetically unique or isolated populations (so-called special
populations, such as Amish or Finnish populations). These populations have already
been used successfully for Mendelian diseases, since they are often homogeneous in
disease origin. In other words, there are likely to be only a few founding individuals
who carried specific chromosomal haplotypes on which the original mutation occurred.
If the population is relatively new (i.e., the result of recent admixture of two
populations), this approach can also be useful in the general mapping of disease loci
as well, as has been shown in rare recessive disorders (Hastbacka et al., 1992). This
is possible because the linkage disequilibrium is likely to extend over larger areas
(several centimorgans) of the chromosome, since the number of generations available
to allow decay of the linkage disequilibrium by recombination is small.

These populations can be equally useful in mapping complex traits. The basic
premise is that genetically isolated populations will have fewer genes contributing
toward a disease trait, and therefore the effect of each remaining susceptibility gene
will be easier to detect. The value of these populations in genetic mapping studies
has long been realized. Homozygosity mapping was described early on in the
molecular revolution (Lander and Botstein, 1987). Recently these advances have been
expanded to include the use of pooling strategies (Sheffield et al., 1994) and the
exploitation of the phenomena of linkage disequilibrium together with the isolated
inbred nature of these groups through the approach of "shared segment" mapping of a
complex phenotype (Houwen et al., 1994; Durham and Feingold, 1997). Thus the
great advantage of the special population is its power to detect linkage. However, it
must be pointed out that this power comes at the potential cost of specificity: Only
one or a few of the entire suite of susceptibility genes may be found, and the effect
of this gene or genes may be limited to the special population being studied.

ASSOCIATION STUDIES: IMPLEMENTATION

There are two types of association studies. Case-control studies compare allele
frequencies in a set of unrelated affected individuals to a set of matched controls.
The control populations should be matched with respect to ethnicity as well as other
factors such as age. Spurious associations can result because of population
stratification (i.e., the existence of multiple population subtypes in what is assumed
to be a relatively homogeneous population). Such stratification can represent either
recent admixture or the incorrect matching of cases and controls. The existence of
these confounding factors can lead to a significant result even in unlinked loci
(Table 15.1) or unassociated loci within stratum.
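
How stratification alone can produce a spurious association can be sketched with a small numerical example in Python. The allele counts below are entirely hypothetical (they are not the data of Table 15.1), and the helper name `chi_square_2x2` is introduced here only for illustration. Within each stratum the marker allele frequency is identical in cases and controls, but the strata differ in allele frequency and in the proportion of cases.

```python
def chi_square_2x2(table):
    """Pearson chi-square statistic for a 2x2 table of counts [[a, b], [c, d]]."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical allele counts: rows = (marker allele, other alleles),
# columns = (cases, controls).
stratum_a = [[320, 80], [80, 20]]   # allele frequency 0.8 in cases and in controls
stratum_b = [[20, 80], [80, 320]]   # allele frequency 0.2 in cases and in controls

# Ignoring the strata amounts to adding the two tables cell by cell.
pooled = [[a + b for a, b in zip(row_a, row_b)]
          for row_a, row_b in zip(stratum_a, stratum_b)]

if __name__ == "__main__":
    for name, table in (("stratum A", stratum_a),
                        ("stratum B", stratum_b),
                        ("pooled", pooled)):
        print(f"{name}: chi-square = {chi_square_2x2(table):.1f}")
```

In this constructed example each stratum gives a chi-square of 0, yet the pooled table gives 129.6, because stratum A contributes mostly cases and stratum B mostly controls.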

An example of a case-control study is the identification of the APOE-4
allele as the susceptibility gene in late-onset familial and sporadic AD. Table 15.2
presents data on a large study of over 500 Alzheimer disease patients and age- and
ethnically matched controls. Using standard chi-square analysis, a significant
association was found (p < 0.001). As indicated in Table 15.2, there appears to be an
increase in the APOE-4 allele, and a concomitant decrease in the APOE-3 allele, in
Alzheimer patients.

Table 15.2 Case-Control Association Studies: APOE-4 Allele and Alzheimer Disease

a. Observed Counts

APOE-4 allele        AD    Controls    Total
APOE-4              240          60      300
Not APOE-4          360         340      700
Total               600         400     1000

b. Expected Counts

APOE-4 allele        AD    Controls    Total
APOE-4              180         120      300
Not APOE-4          420         280      700
Total               600         400     1000

$\chi^2 = \frac{(240 - 180)^2}{180} + \frac{(60 - 120)^2}{120} + \frac{(360 - 420)^2}{420} + \frac{(340 - 280)^2}{280} = 71.4$, P < 0.0001.
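
The chi-square calculation for Table 15.2 can be checked with a short Python sketch. It uses the observed counts reported in the table (240 and 60 for APOE-4, 360 and 340 for other alleles) and derives the expected counts from the row and column totals. The function name is hypothetical, and the p-value uses the closed form for one degree of freedom from the standard library rather than a statistics package.

```python
import math


def chi_square_2x2(table):
    """Return (statistic, expected counts, p-value) for a 2x2 table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count in each cell = row total * column total / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, expected, p_value


if __name__ == "__main__":
    # Rows: APOE-4, not APOE-4; columns: AD, controls (Table 15.2a).
    observed = [[240, 60], [360, 340]]
    stat, expected, p_value = chi_square_2x2(observed)
    print("expected counts:", expected)
    print(f"chi-square = {stat:.1f}, p = {p_value:.2g}")
```

The statistic comes to about 71.4 with expected counts of 180, 120, 420, and 280, in agreement with the table.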

Family-based studies control for the possibility of genetic differences between
the case and control populations by comparing the frequencies of alleles transmitted
to the affected child to the alleles not transmitted. The only samples necessary are
those from the affected individual and his or her two parents (the TDT triad). This
approach eliminates the concern that population substructure may be the cause of
the association. These studies include the transmission disequilibrium test (TDT)
(Spielman et al., 1993), the haplotype relative risk test (HRR) (Falk and Rubenstein,
1987), and the AFBAC method (Thomson, 1995). The HRR and AFBAC approaches
were developed as family-based tests for association, and the AFBAC method is
designed to detect association in the presence of linkage. The TDT approach tests
for linkage in the presence of association. Both AFBAC and TDT have little power
unless linkage and association coexist. The difference between these two methods is
that the TDT can also function as a test of association in the presence of population
admixture and can be used as a valid test of linkage. The statistical differences
between these methods are subtle and are not described here.

The TDT is the most widely used of all the tests. It can be a more powerful test
to detect linkage than the affected sib pair linkage tests, especially when the
genetic effect is small (Spielman et al., 1993; Risch and Merikangas, 1996), as is
often the case with genetically complex traits. The disadvantage of this approach is
that the TDT has no power to detect linkage if association is not present. The TDT
test examines the number of transmissions of allele 1 (A1) or allele 2 (A2) from a
heterozygous parent to an affected offspring. An example of the TDT is given in
Figure 15.2. As a test of linkage the counts needed in Figure 15.2 can come from
simplex, multiplex or even multigenerational family data. The statistical
significance is tested by standard chi-square (McNemar's test). The data used in
constructing the counts comes only from heterozygous parents.

[Figure 15.2 Example of the TDT]

The TDT statistic is as follows:

$$\chi^2 = \frac{(b - c)^2}{b + c}$$

where $b$ is the number of times an A1/A2 parent transmits an A1 to an affected
offspring and $c$ is the number of times an A1/A2 parent transmits an A2 to an
affected offspring.
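
A minimal Python sketch of the TDT calculation follows, taking the statistic as the McNemar chi-square (b - c)^2 / (b + c) on one degree of freedom, where b and c are the transmission counts of A1 and A2 from heterozygous parents. The function name and the counts used in the example are hypothetical and serve only to show the arithmetic.

```python
import math


def tdt(b, c):
    """McNemar-type TDT statistic and its 1-df chi-square p-value.

    b: transmissions of A1 from A1/A2 parents to affected offspring
    c: transmissions of A2 from A1/A2 parents to affected offspring
    """
    if b + c == 0:
        raise ValueError("no transmissions from heterozygous parents")
    stat = (b - c) ** 2 / (b + c)
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


if __name__ == "__main__":
    # Hypothetical counts: 60 transmissions of A1 versus 40 of A2.
    stat, p_value = tdt(60, 40)
    print(f"TDT chi-square = {stat:.2f}, p = {p_value:.4f}")
```

With these made-up counts the statistic is 4.0 and the p-value is about 0.046; equal transmission (b = c) gives a statistic of 0.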

The test statistic then tests for deviations from the expected equal transmission
rate into the two categories from the heterozygous parents. Homozygous parents
do not need to be scored. This is different from the AFBAC method, which uses
both heterozygous and homozygous parents. A significant result indicates that
the marker is linked to the disease locus. The TDT can find linkage only in the
presence of association. If there were only linkage and no linkage disequilibrium,
then across families there would be no difference between b and c, since the allele
in coupling with the disease gene in each family is random, preventing the
detection of linkage. The TDT can also serve as a test of linkage disequilibrium if
only simplex families are used or if only one affected individual and his or her
parents are included per family. This use of the TDT is critical for narrowing a broad
region of interest identified by linkage analysis. Analysis of association could
potentially identify the markers that are closest to the actual disease
susceptibility locus.

The TDT approach was useful in identifying the relationship of the
insulin gene in insulin-dependent diabetes mellitus (IDDM) (Spielman et al.,
1993) (Table 15.3). The TDT method does not require researchers to go to the
pains and effort of recruiting families with multiple affected individuals. Indeed,
the clinical status of the parents does not have to be known. In some cases,
however, these advantages may be outweighed by practical problems. In the diabetes
situation, both parents of the affected family members were available for study,
making this approach ideally suited to IDDM. However, in the Alzheimer/APOE-4
example, since AD is a late-onset disease, parental DNA is almost always
unavailable, and thus the traditional TDT approach was impossible. A novel
approach that circumvents this difficulty is to use unaffected siblings as controls
rather than relying on parental controls (Curtis, 1997; Boehnke and Langefeld,
1998; Spielman and Ewens, 1998). This sib-TDT (S-TDT) approach compares
marker allele frequencies in affected and unaffected siblings. The test requires
only a simple affected/unaffected sibling pair, although power can be increased
somewhat if additional siblings are available. The S-TDT retains the advantages of
the TDT in that it provides a test for linkage and association and is immune to the
effects of sampling bias.

The TDT test was originally developed to look at biallelic marker systems or
situations of alleles that could be readily collapsed because of prior information
regarding a known association. With the availability of a multitude of multiallelic
markers, several new statistics have been proposed. These included the symmetry
statistic ($T_c$) and the marginal statistic ($T_m$) of Bickeboller and
Clerget-Darpoux (1995), the likelihood ratio statistic ($T_l$) of Sham and Curtis
(1995), and the marginal statistic for heterozygous parents ($T_{mhet}$) of Spielman
and Ewens (1996). Kaplan et al. (1997) investigated the properties of these
statistics and determined that the $T_{mhet}$ was the most appropriate and efficient
method. It had equivalent power to the other tests and it gave an (approximately)
valid chi-square test of linkage. Their recommendation for multiallelic markers was
to implement the $T_{mhet}$ using critical values of $\chi^2$ with (m - 1) df
(m = number of alleles at the marker locus) and including all available affected
individuals and their heterozygote parents in the analysis.
Table 15.4 provides the definition of the statistic.
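
A minimal Python sketch of a multiallelic marginal statistic of this kind is given below. It assumes the commonly cited form T_mhet = ((m - 1) / m) * sum over alleles i of (n_i. - n_.i)^2 / (n_i. + n_.i - 2 n_ii), where n_ij counts parents who transmit allele i and do not transmit allele j. That form is an assumption here, since the table's own definition is not reproduced in this text, and the function name and counts are hypothetical.

```python
def t_mhet(n):
    """Marginal multiallelic TDT statistic (assumed form, see lead-in).

    n[i][j] = number of parents transmitting allele i and not transmitting
    allele j. Diagonal entries (homozygous parents) do not contribute.
    Compare the result with a chi-square distribution on m - 1 df.
    """
    m = len(n)
    total = 0.0
    for i in range(m):
        transmitted = sum(n[i])                           # n_i.
        not_transmitted = sum(n[k][i] for k in range(m))  # n_.i
        denom = transmitted + not_transmitted - 2 * n[i][i]
        if denom > 0:
            total += (transmitted - not_transmitted) ** 2 / denom
    return (m - 1) / m * total


if __name__ == "__main__":
    # Hypothetical three-allele transmission table, for illustration only.
    counts = [[0, 30, 25],
              [15, 0, 20],
              [10, 20, 0]]
    print(f"T_mhet = {t_mhet(counts):.2f} on {len(counts) - 1} df")
```

As a consistency check on the assumed form, with two alleles it reduces to the biallelic statistic (b - c)^2 / (b + c).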
Table 15.4 Multiallelic TDT: $T_{mhet}$

THE USE OF THE TDT IN GENOMIC SCREENS

The TDT was originally proposed to test for linkage to specific candidate loci.
More recently, discussions have centered on the use of the TDT approach for entire
genome scans (Risch and Merikangas, 1996). The ability to replicate in a comparable
data set is as critically important when one is using the TDT as it is to the
performance of linkage studies using affected relative pair analysis. With the advent
of microchip technology on the horizon, and the interest in the development of
ancestral single nucleotide polymorphisms (SNPs), the expanded use of this
methodology to quickly map and identify genes is a certainty.

However, there are several potential problems with the TDT in genomic scanning.
First, the problem of multiple comparisons arises in this situation. That is,
when so many statistical tests are performed, false positive results are likely by
chance alone unless the usual significance value of 0.05 or 0.01 is modified. Thus
use of a critical value that is more stringent than the nominal P value is warranted.
The usual Bonferroni correction approach (simply dividing the desired nominal
significance level by the number of tests performed) will be too conservative because
it assumes each test to be independent of the others. This will not be the case, since
many of the markers will be linked and associated with each other. Unfortunately, it
is not clear what the appropriate correction needs to be, although simulation-based
statistics are being explored.
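
The scale of the multiple-comparisons problem can be sketched in a few lines of Python. The figure of 3000 tests is used purely for illustration (roughly one marker per centimorgan), the function names are hypothetical, and the family-wise error calculation assumes independent tests, which is exactly the assumption that makes the Bonferroni correction conservative for linked markers.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after a Bonferroni correction."""
    return alpha / n_tests


def expected_false_positives(alpha, n_tests):
    """Expected number of false positives if every null hypothesis is true."""
    return alpha * n_tests


def familywise_error(alpha, n_tests):
    """Chance of at least one false positive, assuming independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    alpha, n_tests = 0.05, 3000  # illustrative values only
    threshold = bonferroni_threshold(alpha, n_tests)
    print(f"expected false positives at {alpha}: "
          f"{expected_false_positives(alpha, n_tests):.0f}")
    print(f"Bonferroni per-test threshold: {threshold:.2e}")
    print(f"family-wise error at that threshold: "
          f"{familywise_error(threshold, n_tests):.3f}")
```

With these illustrative values, about 150 false positives would be expected at an uncorrected level of 0.05, and the Bonferroni per-test threshold is roughly 1.7e-05.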

The second problem is simply the number of polymorphic markers necessary.
Even at one marker per centimorgan, over 3000 well-mapped markers are needed.
In addition, these markers must actually be located at 1 cM distances, not just
scattered with a 1 cM average distance. While many more than 3000 markers exist,
there are still many regions of the genome up to 20 cM in length with no known
polymorphisms.

The third problem is that the TDT approach rests completely on the assumption
that some level of linkage disequilibrium exists. While this may be true in many
cases, susceptibility alleles arising from frequent mutation events, existing as
extremely old mutations, or arising in regions with very high recombination rates,
will have little if any detectable linkage disequilibrium.

SUMMARY 

Allelic association may arise for several reasons, but association due to linkage
disequilibrium can be exploited to aid in the mapping of genetically complex diseases.
Both case-control and family-based methods can be used. The latter have several
advantages, especially when tested using the TDT or its variant, the sib-TDT. The
TDT is both simple and powerful, and it will have substantial power for detection of
susceptibility alleles as better and more finely spaced markers are available for
study.